Solving Continuous-Time Transition-Independent DEC-MDP with Temporal Constraints
Authors
Abstract
Despite the impact of DEC-MDPs over the past decade, scaling to large problem domains has been difficult to achieve. The scale-up problem is exacerbated in DEC-MDPs with continuous states, which are critical in domains involving time; the latest algorithm (M-DPFP) does not scale up beyond two agents and a handful of unordered tasks per agent. This paper focuses on meeting this challenge in continuous-resource DEC-MDPs with two primary contributions. First, it introduces a novel continuous-time model for multi-agent planning problems that exploits transition independence in domains with graphical agent dependencies and temporal constraints. More importantly, it presents a new, iterative, locally optimal algorithm called SPAC that combines the following key ideas: (1) defining a novel augmented CT-MDP such that solving this single-agent continuous-time MDP provably provides an automatic best response to neighboring agents' policies; (2) fast convolution to efficiently generate such augmented MDPs; (3) a new, enhanced lazy-approximation algorithm to solve these augmented MDPs; (4) intelligent seeding of initial policies in the iterative process; (5) exploiting the graph structure of reward dependencies to restrict computation to local interactions for scalability. Our experiments show that SPAC not only finds solutions of comparable quality substantially faster than M-DPFP, but also scales well to large teams of agents.
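The role of fast convolution (key idea 2 above) under temporal constraints can be illustrated with a toy sketch: if each task's duration is an independent discrete-time distribution, the completion time of a sequence of tasks is the convolution of those distributions, and a deadline constraint reduces to the probability mass at or before the deadline. All names here are illustrative; this is a simplified discretized sketch, not the paper's implementation.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions over time steps."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def completion_time_pmf(task_durations):
    """PMF of the total completion time of tasks executed in sequence."""
    pmf = [1.0]  # zero elapsed time with probability 1
    for duration_pmf in task_durations:
        pmf = convolve(pmf, duration_pmf)
    return pmf

def prob_meets_deadline(task_durations, deadline):
    """Probability that the whole task sequence finishes by `deadline` steps."""
    pmf = completion_time_pmf(task_durations)
    return sum(pmf[: deadline + 1])

# Two tasks, each taking 1 or 2 time steps with equal probability
# (index = duration in steps). The total is 2, 3, or 4 steps with
# probabilities 0.25, 0.5, 0.25, so a deadline of 3 is met w.p. 0.75.
tasks = [[0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]
print(prob_meets_deadline(tasks, 3))  # 0.75
```

In the continuous-time setting the same composition is done over duration density functions rather than discrete PMFs, which is where efficient convolution matters.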
Similar papers
Continuous Time Planning for Multiagent Teams with Temporal Constraints
Continuous state DEC-MDPs are critical for agent teams in domains involving resources such as time, but scaling them up is a significant challenge. To meet this challenge, we first introduce a novel continuous-time DEC-MDP model that exploits transition independence in domains with temporal constraints. More importantly, we present a new locally optimal algorithm called SPAC. Compared to the be...
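The iterative, locally optimal scheme that SPAC follows can be sketched generically: each agent in turn computes a best response against the fixed policies of the others until no agent changes, converging to a local optimum whose quality depends on the initial seed (motivating intelligent seeding). The toy below uses a two-agent team reward table in place of full MDP policies; all names and the reward table are hypothetical, not the authors' implementation.

```python
def iterative_best_response(n_actions, reward, seeds, max_rounds=100):
    """Alternating best-response for a toy 2-agent team problem.

    reward[i][j] is the shared team reward when agent 0 plays i and agent 1
    plays j. Each round, each agent switches to a best response against the
    other's fixed choice; the loop stops at a fixed point (a local optimum).
    """
    a0, a1 = seeds
    for _ in range(max_rounds):
        new_a0 = max(range(n_actions), key=lambda i: reward[i][a1])
        new_a1 = max(range(n_actions), key=lambda j: reward[new_a0][j])
        if (new_a0, new_a1) == (a0, a1):
            break
        a0, a1 = new_a0, new_a1
    return a0, a1, reward[a0][a1]

# A team reward with a local optimum at (0, 0) and the global optimum at (2, 2):
R = [[5, 0, 0],
     [0, 1, 0],
     [0, 0, 9]]
print(iterative_best_response(3, R, seeds=(0, 0)))  # (0, 0, 5): stuck locally
print(iterative_best_response(3, R, seeds=(2, 2)))  # (2, 2, 9): better seed, global optimum
```

In SPAC itself, the inner `max` is replaced by solving the agent's augmented CT-MDP, which provably yields a best response to the neighbors' fixed policies.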
Exploiting Separability in Multiagent Planning with Continuous-State MDPs (Extended Abstract)
Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in cooperative decentralized settings, but are difficult to solve optimally (NEXP-Complete). As a new way of solving these problems, we recently introduced a method for transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear a...
Optimally Solving Dec-POMDPs as Continuous-State MDPs
Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in decentralized settings, but are difficult to solve optimally (NEXP-Complete). As a new way of solving these problems, we introduce the idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function...
An Iterative Algorithm for Solving Constrained Decentralized Markov Decision Processes
Despite the significant progress to extend Markov Decision Processes (MDP) to cooperative multi-agent systems, developing approaches that can deal with realistic problems remains a serious challenge. Existing approaches that solve Decentralized Markov Decision Processes (DEC-MDPs) suffer from the fact that they can only solve relatively small problems without complex constraints on task executi...
Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints
We consider synthesis of controllers that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities. The solution we develop builds on the so-called model-based probably approximately correct Mark...
Published: 2011